2 - Machine Learning for Physicists [ID:11034]

Okay, good evening. Welcome to the second lecture in the machine learning course. Regarding some organizational details, like when to do the exam, we will discuss this at the end of the lecture. But now I just want to remind you briefly of what we did last time. We introduced the general structure of a neural network. It is composed of neurons that are connected to each other, and there is a bunch of neurons at the bottom of the network that are the input neurons. You feed in your input values, and then layer by layer you calculate the output values.

The way this is done is shown here, just to remind you. There are two steps in each piece of the calculation: a linear step and a non-linear step. In the linear step you take all the neuron values in the lowest layer, form a weighted superposition of these values, as shown here, and feed it into a neuron in the upper layer. That gives you the value that is here called z. But this alone is not enough; it would not give you any powerful neural network. So in addition you apply a non-linear function, called f, to each of the values you calculated previously in the linear step. And that's it. Then you just proceed step by step, layer by layer, doing the same thing: linear superposition, non-linear function, linear superposition, non-linear function.
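Written out as a formula, one such step could look like this (a minimal sketch, assuming explicit neuron indices and a bias term b_j, which the slides may write differently):

    z_j = \sum_k w_{jk} \, y_k + b_j \qquad\text{and}\qquad y_j^{\text{new}} = f(z_j)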

Then I showed you how we would do this in Python. I pointed out that what you are doing in the linear step is essentially a matrix-vector multiplication. So in a programming language like Python, which has nice linear algebra capabilities, you can literally apply a matrix to a vector in a single step, and that gives you the result of the linear step. Then there is just the additional step of applying the non-linear function. So we looked at how this would be done in Python.
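A minimal sketch of such a single step in Python with NumPy could look as follows (the array names, the sizes, and the convention of storing the weights as an n_in-by-n_out matrix are illustrative assumptions, not necessarily the exact code from the lecture):

    import numpy as np

    def sigmoid(z):
        # smooth step: approaches 0 for very negative z and 1 for very positive z
        return 1.0 / (1.0 + np.exp(-z))

    n_in, n_out = 3, 2                  # neurons in the lower / upper layer
    y_in = np.array([0.2, -1.0, 0.5])   # values of the lower-layer neurons
    w = np.random.randn(n_in, n_out)    # weight matrix (one column per upper-layer neuron)
    b = np.random.randn(n_out)          # bias for each upper-layer neuron

    z = np.dot(y_in, w) + b             # linear step: matrix-vector product in a single call
    y_out = sigmoid(z)                  # non-linear step, applied elementwise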

I also visualized what happens here. If you have only two input values, so two input neurons and one output neuron, then you can plot the two input values in the two-dimensional plane, with the output value on the vertical axis. Here I only plotted the linear part, which obviously must be a plane: a linear function of two variables gives a plane in such a representation. But then I apply the non-linear function, and the non-linear function can be any non-linear function. One famous example is the sigmoid, which essentially pushes values far below zero towards zero, suppresses values far above zero down to the level of one, and has a smooth transition in between. That is what is shown here. So that would be one step, still relatively boring, but then you can proceed layer by layer and you get the interesting behavior that we already observed last time.
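A minimal sketch of this kind of picture, assuming a sigmoid non-linearity and some arbitrarily chosen weights (the actual values on the slides may differ):

    import numpy as np
    import matplotlib.pyplot as plt

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # grid of the two input values y1, y2
    y1, y2 = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))

    w1, w2, b = 1.0, -0.5, 0.2        # example weights and bias
    z = w1 * y1 + w2 * y2 + b         # linear part: a plane over the (y1, y2) plane
    y_out = sigmoid(z)                # non-linear part: squashed smoothly between 0 and 1

    fig, axes = plt.subplots(1, 2, subplot_kw={"projection": "3d"})
    axes[0].plot_surface(y1, y2, z)       # the plane (linear step only)
    axes[1].plot_surface(y1, y2, y_out)   # after applying the sigmoid
    plt.show()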

There is one extra thing I have to tell you about to make things efficient. In the end we will be interested in calculating the output of the network not only for a single input, but for hundreds of inputs in parallel. This can also be done very efficiently using the same linear algebra, matrix-vector notation, and that is what I want to talk about here.

When we try to get the output of the network for 100 samples in parallel, we call these 100 samples together a batch. A batch of samples is just a set of samples that we want to apply the network to. I have drawn the situation here: you would have, say, a network with three input neurons, so each sample consists of three values, and you want to feed many samples in parallel into the network without doing an explicit loop. Of course, you could always loop over all the samples, produce the corresponding outputs, and store them, but you want to do it without a loop, because at least in an interpreted language like Python a loop would be terribly inefficient. The way to do this is simply to expand our arrays. I will go through the details, but all the arrays will acquire an extra index. This extra index counts the sample, so if we have 100 samples, it has values running from 0 to 99.
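A minimal sketch of the batched version, under the same illustrative conventions as before (100 samples, three input neurons, and the sample index as the first array dimension):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    batchsize, n_in, n_out = 100, 3, 2

    # each row is one sample: shape (batchsize, n_in), the extra index counts the sample
    y_in = np.random.randn(batchsize, n_in)

    w = np.random.randn(n_in, n_out)   # weights
    b = np.random.randn(n_out)         # biases

    # one matrix multiplication handles all samples at once, no Python loop
    z = np.dot(y_in, w) + b            # shape (batchsize, n_out)
    y_out = sigmoid(z)                 # non-linearity applied elementwise

The single call to np.dot now does the work of the whole loop over samples.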

This is shown here in some more detail. Usually we would have one sample, which would be a vector of size n_in, where n_in is the number of input neurons of my network. That is what we already discussed. But now if I have many

Part of a video series:

Accessible via: Open Access

Duration: 01:24:31 min

Recording date: 2019-05-06

Uploaded on: 2019-05-07 10:59:03

Language: en-US

This is a course introducing modern techniques of machine learning, especially deep neural networks, to an audience of physicists. Neural networks can be trained to perform diverse challenging tasks, including image recognition and natural language processing, just by training them on many examples. Neural networks have recently achieved spectacular successes, with their performance often surpassing humans. They are now also being considered more and more for applications in physics, ranging from predictions of material properties to analyzing phase transitions. We will cover the basics of neural networks, convolutional networks, autoencoders, restricted Boltzmann machines, and recurrent neural networks, as well as the recently emerging applications in physics.

Tags

matrix backpropagation weights samples vector layer input processing gradient function network weight neuron sample batches numlayers approximate biases